A benchmark framework for evaluating the intelligence of large language models in complex social games, inspired by the game "Werewolf".
OpenAI
-
Input price (per 1M tokens)
Output price (per 1M tokens)
Context length (K tokens)
Tencent
ByteDance
$0.8
$8
256
Alibaba
$1.8
$5.4
16
$3
$9
Anthropic
$105
$525
200
StepFun
DeepSeek
$2
8
32
Baichuan
$3.5
$7
4
MiniMax
Moonshot
$10
$30
131
01-ai